Using Grammatical Markov Models for Stylometric Analysis (CS224N: Natural Language Processing)
Abstract
Researchers have tackled the problem of authorship attribution in several different ways, using various metrics to identify the author of an anonymous document given a set of writing samples from potential candidates. A common criticism of modern methodologies is content bias, which occurs when quantitative models identify similar content rather than similar styles. Content bias artificially inflates accuracy: models produce good results on test data but fail to identify authors in real-world applications. We examine several quantitative methods that isolate style by using grammar-based features rather than relying on models of word shape and frequency.
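As a rough illustration of what a grammar-based feature model could look like, the sketch below trains a first-order Markov model over part-of-speech tag sequences for each candidate author and attributes an anonymous sample to the author whose model assigns it the highest likelihood. This is an assumption made for illustration, not the method the abstract describes: the function names, the add-one smoothing, the toy tag sequences, and the author labels are all hypothetical, and the input is assumed to be already tagged.

```python
import math
from collections import defaultdict

def train_tag_bigrams(tag_sequences):
    """Count tag-to-tag transitions over one author's tagged sentences."""
    counts = defaultdict(lambda: defaultdict(int))
    for tags in tag_sequences:
        padded = ["<s>"] + list(tags) + ["</s>"]
        for prev, cur in zip(padded, padded[1:]):
            counts[prev][cur] += 1
    return counts

def log_likelihood(counts, tags, vocab_size):
    """Add-one smoothed log-probability of a tag sequence under one model."""
    padded = ["<s>"] + list(tags) + ["</s>"]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        row = counts.get(prev, {})
        numer = row.get(cur, 0) + 1
        denom = sum(row.values()) + vocab_size
        total += math.log(numer / denom)
    return total

def attribute(models, tags, vocab_size):
    """Return the candidate author whose model scores the sample highest."""
    return max(models, key=lambda a: log_likelihood(models[a], tags, vocab_size))

# Hypothetical toy data: pre-tagged sentences for two candidate authors.
samples = {
    "author_a": [["DT", "NN", "VBZ", "JJ"], ["DT", "JJ", "NN", "VBZ"]],
    "author_b": [["PRP", "VBD", "RB"], ["PRP", "VBD", "DT", "NN"]],
}
models = {name: train_tag_bigrams(seqs) for name, seqs in samples.items()}

# Smoothing vocabulary: every observed tag plus the end-of-sentence marker.
all_tags = {t for seqs in samples.values() for s in seqs for t in s}
vocab_size = len(all_tags) + 1

guess = attribute(models, ["DT", "NN", "VBZ"], vocab_size)
```

Because the model sees only tags and never words, two documents on the same topic do not score as similar unless their grammatical transitions also match, which is the intuition behind avoiding content bias.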
Similar resources
Attention-based Recurrent Neural Networks for Question Answering
Machine Comprehension (MC) of text is an important problem in Natural Language Processing (NLP) research, and the task of Question Answering (QA) is a major way of assessing MC outcomes. One QA dataset that has gained immense popularity recently is the Stanford Question Answering Dataset (SQuAD). Successful models for SQuAD have all involved the use of Recurrent Neural Network (RNN), and most o...
Pattern Recognition Applied to the Acquisition of a Grammatical Classification System from Unrestricted English Text
Within computational linguistics, the use of statistical pattern matching is generally restricted to speech processing. We have attempted to apply statistical techniques to discover a grammatical classification system from a Corpus of 'raw' English text. A discovery procedure is simpler for a simpler language model; we assume a first-order Markov model, which (surprisingly) is shown elsewhere t...
Statistical Language Modeling Using Grammatical Information
We propose to investigate the use of grammatical information to build improved statistical language models. Until recently, language models were primarily influenced by local lexical constraints. Today, language models often utilize longer range lexical information to aid in their predictions. All of these language models ignore grammatical considerations other than those induced by the statisti...
CS224N Assignment 4: Reading Comprehension
Building question answering systems with deep learning is a significant application of solving the complex natural language problem of reading comprehension. In our approach, we have analyzed literature on previous work, implemented and improved on specific models described, and compared the various models to analyze the effects of certain aspects of models on performance on the question answer...